NSF PAR Search | NSF Public Access Repository

OpenAssert: Towards Secure Assertion Generation using Large Language Models

https://doi.org/10.1109/VTS65138.2025.11022798

Menon, Anand; Miftah, Samit Shahnawaz; Srivastava, Amisha; Kundu, Shamik; Kundu, Souvik; Raha, Arnab; Banerjee, Suvadeep; Mathaikutty, Deepak; Basu, Kanad (April 2025, IEEE)

Free, publicly-accessible full text available April 28, 2026
Towards Real-Time LLM Inference on Heterogeneous Edge Platforms

https://doi.org/10.1109/HiPCW63042.2024.00076

Jayanth, Rakshith; Gupta, Neelesh; Kundu, Souvik; Mathaikutty, Deepak A; Prasanna, Viktor (December 2024, IEEE)

Full Text Available
LAMB: A Training-Free Method to Enhance the Long-Context Understanding of SSMs via Attention-Guided Token Filtering

https://doi.org/10.18653/v1/2025.acl-short.96

Ye, Zhifan; Wang, Zheng; Xia, Kejing; Hong, Jihoon; Li, Leshu; Whalen, Lexington; Wan, Cheng; Fu, Yonggan; Lin, Yingyan Celine; Kundu, Souvik (January 2025, Association for Computational Linguistics)

Full Text Available
Don’t Just Prune by Magnitude! Your Mask Topology is Another Secret Weapon

Hoang, Duc; Kundu, Souvik; Liu, Shiwei; Wang, Zhangyang (December 2023, Advances in neural information processing systems)

Recent years have witnessed significant progress in understanding the relationship between the connectivity of a deep network's architecture as a graph, and the network's performance. A few prior works connected deep architectures to expander graphs or Ramanujan graphs, and in particular, [7] demonstrated the use of such graph connectivity measures with ranking and relative performance of various obtained sparse sub-networks (i.e., models with prune masks) without the need for training. However, no prior work explicitly explores the role of parameters in the graph's connectivity, leaving the graph-based understanding of prune masks and the magnitude/gradient-based pruning practice isolated from one another. This paper strives to fill this gap by analyzing the Weighted Spectral Gap of Ramanujan structures in sparse neural networks and investigating its correlation with final performance. We specifically examine the evolution of sparse structures under a popular dynamic sparse-to-sparse network training scheme, and intriguingly find that the generated random topologies inherently maximize Ramanujan graphs. We also identify a strong correlation between masks, performance, and the weighted spectral gap. Leveraging this observation, we propose to construct a new "full-spectrum coordinate" aiming to comprehensively characterize a sparse neural network's promise. Concretely, it consists of the classical Ramanujan's gap (structure), our proposed weighted spectral gap (parameters), and the constituent nested regular graphs within. In this new coordinate system, a sparse subnetwork's L2-distance from its original initialization is found to be nearly linearly correlated with its performance. Eventually, we apply this unified perspective to develop a new actionable pruning method, by sampling sparse masks to maximize the L2-coordinate distance. Our method can be augmented with the "pruning at initialization" (PaI) method, and significantly outperforms existing PaI methods. With only a few iterations of training (e.g., 500 iterations), we can get LTH-comparable performance as that yielded via "pruning after training", significantly saving pre-training costs. Codes can be found at: https://github.com/VITA-Group/FullSpectrum-PAI.
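For orientation, the classical (unweighted) Ramanujan condition that the abstract builds on can be written as below. This is the standard textbook definition for a d-regular graph only; the symbol \Delta_r is a label assumed here for the gap, and the paper's weighted spectral gap and its treatment of bipartite layer graphs are not defined in this abstract, so they are not reproduced.

```latex
% Standard definition for a d-regular graph G with adjacency eigenvalues
% \lambda_1 = d \ge \lambda_2 \ge \dots \ge \lambda_n.
% \Delta_r is an assumed name for the gap, not notation taken from the paper.
\hat{\mu}(G) = \max_{|\lambda_i| \neq d} |\lambda_i|, \qquad
G \text{ is Ramanujan} \iff \hat{\mu}(G) \le 2\sqrt{d-1}, \qquad
\Delta_r = 2\sqrt{d-1} - \hat{\mu}(G).
```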
Full Text Available
Plasmonic Optical Fiber Based Continuous in-Vivo Glucose Monitoring for ICU/CCU Setup

https://doi.org/10.1109/TNB.2023.3303345

Kundu, Souvik; Tabassum, Shawana; Kumar, Ritwesh A.; Abel, E. Dale; Kumar, Ratnesh (January 2024, IEEE Transactions on NanoBioscience)

Full Text Available
ACE-SNN: Algorithm-Hardware Co-design of Energy-Efficient & Low-Latency Deep Spiking Neural Networks for 3D Image Recognition

https://doi.org/10.3389/fnins.2022.815258

Datta, Gourav; Kundu, Souvik; Jaiswal, Akhilesh R.; Beerel, Peter A. (April 2022, Frontiers in Neuroscience)

High-quality 3D image recognition is an important component of many vision and robotics systems. However, the accurate processing of these images requires the use of compute-expensive 3D Convolutional Neural Networks (CNNs). To address this challenge, we propose the use of Spiking Neural Networks (SNNs) that are generated from iso-architecture CNNs and trained with quantization-aware gradient descent to optimize their weights, membrane leak, and firing thresholds. During both training and inference, the analog pixel values of a 3D image are directly applied to the input layer of the SNN without the need to convert to a spike-train. This significantly reduces the training and inference latency and results in a high degree of activation sparsity, which yields significant improvements in computational efficiency. However, this introduces energy-hungry digital multiplications in the first layer of our models, which we propose to mitigate using a processing-in-memory (PIM) architecture. To evaluate our proposal, we propose a 3D and a 3D/2D hybrid SNN-compatible convolutional architecture and choose hyperspectral imaging (HSI) as an application for 3D image recognition. We achieve overall test accuracy of 98.68, 99.50, and 97.95% with 5 time steps (inference latency) and 6-bit weight quantization on the Indian Pines, Pavia University, and Salinas Scene datasets, respectively. In particular, our models implemented using standard digital hardware achieved accuracies similar to state-of-the-art (SOTA) with ~560.6× and ~44.8× less average energy than an iso-architecture full-precision and 6-bit quantized CNN, respectively. Adopting the PIM architecture in the first layer further improves the average energy, delay, and energy-delay-product (EDP) by 30, 7, and 38%, respectively.
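The direct-input scheme described above can be illustrated with a minimal sketch of a single leaky integrate-and-fire neuron. The function name `lif_direct_encoding`, the multiplicative leak, and the soft reset (subtracting the threshold after a spike) are assumptions made for illustration; the abstract states only that weights, membrane leak, and firing thresholds are trained and that analog pixel values are applied directly at every time step.

```python
def lif_direct_encoding(input_current, leak=0.9, threshold=1.0, time_steps=5):
    """Simulate one leaky integrate-and-fire neuron (hypothetical sketch).

    The same analog input is applied at every time step, so no spike-train
    encoding of the input is needed. Returns a list of 0/1 spikes.
    """
    membrane = 0.0
    spikes = []
    for _ in range(time_steps):
        # Leak the previous membrane potential, then integrate the input.
        membrane = leak * membrane + input_current
        if membrane >= threshold:
            spikes.append(1)
            membrane -= threshold  # soft reset (an assumption of this sketch)
        else:
            spikes.append(0)
    return spikes


if __name__ == "__main__":
    # A weak input spikes only on some steps, giving sparse activations.
    print(lif_direct_encoding(0.5, leak=1.0))
```

In this sketch the time-step count plays the role of inference latency: fewer steps mean fewer passes through the network, and inputs below threshold produce no spikes at all, which is where the activation sparsity comes from.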
Full Text Available
Plasmonic Point-of-Care Device for Sepsis Biomarker Detection

https://doi.org/10.1109/JSEN.2021.3088117

Kundu, Souvik; Tabassum, Shawana; Kumar, Ratnesh (September 2021, IEEE Sensors Journal)

Full Text Available
DNR: A Tunable Robust Pruning Framework Through Dynamic Network Rewiring of DNNs

https://doi.org/10.1145/3394885.3431542

Kundu, Souvik; Nazemi, Mahdi; Beerel, Peter A.; Pedram, Massoud (January 2021, the 26th Asia and South Pacific Design Automation Conference)
Full Text Available
DNR: A Tunable Robust Pruning Framework Through Dynamic Network Rewiring of DNNs

Kundu, Souvik; Nazemi, Mahdi; Beerel, Peter A; Pedram, Massoud (January 2021, 26th Asia and South Pacific Design Automation Conference)
This paper presents a dynamic network rewiring (DNR) method to generate pruned deep neural network (DNN) models that are robust against adversarial attacks yet maintain high accuracy on clean images. In particular, the disclosed DNR method is based on a unified constrained optimization formulation using a hybrid loss function that merges ultra-high model compression with robust adversarial training. This training strategy dynamically adjusts inter-layer connectivity based on per-layer normalized momentum computed from the hybrid loss function. In contrast to existing robust pruning frameworks that require multiple training iterations, the proposed learning strategy achieves an overall target pruning ratio with only a single training iteration and can be tuned to support both irregular and structured channel pruning. To evaluate the merits of DNR, experiments were performed with two widely accepted models, namely VGG16 and ResNet-18, on CIFAR-10, CIFAR-100 as well as with VGG16 on Tiny-ImageNet. Compared to the baseline uncompressed models, DNR provides over 20× compression on all the datasets with no significant drop in either clean or adversarial classification accuracy. Moreover, our experiments show that DNR consistently finds compressed models with better clean and adversarial image classification performance than what is achievable through state-of-the-art alternatives. Our models and test codes are available at https://github.com/ksouvik52/DNR_ASP_DAC2021.
Full Text Available
Pre-defined Sparsity for Low-Complexity Convolutional Neural Networks

https://doi.org/10.1109/TC.2020.2972520

Kundu, Souvik; Nazemi, Mahdi; Pedram, Massoud; Chugg, Keith M.; Beerel, Peter (February 2020, IEEE Transactions on Computers)

The high energy cost of processing deep convolutional neural networks impedes their ubiquitous deployment in energy-constrained platforms such as embedded systems and IoT devices. This article introduces convolutional layers with pre-defined sparse 2D kernels that have support sets that repeat periodically within and across filters. Due to the efficient storage of our periodic sparse kernels, the parameter savings can translate into considerable improvements in energy efficiency due to reduced DRAM accesses, thus promising significant improvements in the trade-off between energy consumption and accuracy for both training and inference. To evaluate this approach, we performed experiments with two widely accepted datasets, CIFAR-10 and Tiny ImageNet, in sparse variants of the ResNet18 and VGG16 architectures. Compared to baseline models, our proposed sparse variants require up to ∼82% fewer model parameters with 5.6× fewer FLOPs with negligible loss in accuracy for ResNet18 on CIFAR-10. For VGG16 trained on Tiny ImageNet, our approach requires 5.8× fewer FLOPs and up to ∼83.3% fewer model parameters with a drop in top-5 (top-1) accuracy of only 1.2% (∼2.1%). We also compared the performance of our proposed architectures with that of ShuffleNet and MobileNetV2. Using similar hyperparameters and FLOPs, our ResNet18 variants yield an average accuracy improvement of ∼2.8%.
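The idea of support sets that repeat periodically can be sketched as follows. This is a minimal illustration, not the article's construction: the function names, the random choice of support sets, and the rule that assigns a pattern to each kernel by `(filter * channels + channel) % period` are all assumptions, since the abstract does not specify how the patterns are chosen or ordered.

```python
import random


def make_periodic_masks(num_filters, num_channels, kernel_size=3,
                        nonzeros=2, period=4, seed=0):
    """Build fixed 0/1 masks for 2D kernels from a small repeating bank.

    Only `period` distinct support sets are stored; every kernel reuses one
    of them, so the sparsity pattern needs very little storage.
    """
    rng = random.Random(seed)
    positions = kernel_size * kernel_size
    # Bank of pre-defined support sets (chosen at random in this sketch).
    bank = [frozenset(rng.sample(range(positions), nonzeros))
            for _ in range(period)]
    masks = []
    for f in range(num_filters):
        per_filter = []
        for c in range(num_channels):
            # Hypothetical assignment rule: cycle through the bank.
            support = bank[(f * num_channels + c) % period]
            per_filter.append([
                [1 if r * kernel_size + col in support else 0
                 for col in range(kernel_size)]
                for r in range(kernel_size)
            ])
        masks.append(per_filter)
    return bank, masks


def parameter_saving(kernel_size, nonzeros):
    """Fraction of kernel weights removed by keeping `nonzeros` per kernel."""
    return 1 - nonzeros / (kernel_size * kernel_size)


if __name__ == "__main__":
    bank, masks = make_periodic_masks(num_filters=8, num_channels=4)
    print(len(bank), "stored patterns;",
          round(100 * parameter_saving(3, 2), 1), "% of kernel weights removed")
```

Because the masks are fixed before training, only the nonzero weights and the small pattern bank need to be stored, which is the source of the reduced memory traffic the abstract refers to.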
Full Text Available
